Skip to content

Conversation

@n0228a
Copy link

@n0228a n0228a commented Oct 8, 2025

This PR adds collocated cokriging methods to GSTools for multivariate geostatistical estimation, allowing users to improve sparse primary variable estimates by leveraging densely-sampled secondary variable data.

Collocated Cokriging Features

  • Two cokriging variants: Simple Collocated (SCCK) and Intrinsic Collocated (ICCK)
  • Inherits from Krige base class for full API compatibility
  • Implements Markov Model I (MM1): C_YZ(h) = ρ_YZ(0)·√(C_Z(h)·C_Y(h))
  • Accessible via gstools.cokriging submodule or top-level: from gstools import SimpleCollocated, IntrinsicCollocated
  • Full variance estimation with return_var=True
  • Parameter validation: cross_corr ∈ [-1, 1], secondary_var > 0
  • Boundary conditions: ρ=0 recovers Simple Kriging, ρ=±1 gives zero ICCK variance
  • SCCK can exhibit variance inflation (σ²_SCCK ≥ σ²_SK) in unstable configurations
  • ICCK provides stable variance: σ²_ICCK = (1-ρ₀²)·σ²_SK ≤ σ²_SK

Simple Collocated Cokriging

Uses only collocated secondary data at the estimation point:

import gstools as gs
model = gs.Gaussian(dim=1, var=0.5, len_scale=2)
scck = gs.SimpleCollocated(
    model, cond_pos, cond_val,
    cross_corr=0.8, secondary_var=1.5,
    mean=1.0, secondary_mean=0.5
)
field, var = scck(pos, secondary_data=sec_data, return_var=True)

Intrinsic Collocated Cokriging

Uses collocated secondary data plus secondary values at all primary locations for more stable variance:

icck = gs.IntrinsicCollocated(
    model, cond_pos, cond_val,
    sec_cond_pos, sec_cond_val,
    cross_corr=0.8, secondary_var=1.5,
    mean=1.0, secondary_mean=0.5
)
field, var = icck(pos, secondary_data=sec_data, return_var=True)

Implementation Details

  • Located in new gstools.cokriging submodule with base class and methods
  • CollocatedCokriging base class handles MM1 formulation and variance computation
  • SimpleCollocated implements SCCK estimator: Z*_SCCK = Z*_SK·(1-k·λ_Y0) + λ_Y0·(Y(u0)-m_Y) + k·λ_Y0·m_Z
  • IntrinsicCollocated implements ICCK estimator using secondary weights
  • Both methods require secondary_data parameter on call
  • Cross-covariance computed as: C_YZ(0) = ρ_YZ(0)·√(C_Z(0)·C_Y(0))
  • Inherits all Krige functionality: grids, models, dimensions, anisotropy, drift, normalizers

Tests

11 tests in test_cokriging.py testing only cokriging logic:

  • Parameter validation (secondary_data required, cross_corr bounds, secondary_var positive, ICCK data length)
  • Boundary conditions (ρ=0 equals SK, ρ=±1 near-zero ICCK variance)
  • Variance formulas (SCCK and ICCK mathematical correctness)
  • Edge cases (SCCK variance inflation, SCCK vs ICCK stability comparison)
  • Exact Interpolation at conditioning points

All tests passing with no warnings.

Examples

Two minimal examples following GSTools conventions (76 & 78 lines):

  • 10_simple_collocated_cokriging.py - SCCK demonstration
  • 11_intrinsic_collocated_cokriging.py - ICCK demonstration
  • Same data with 2 spatial gap features demonstrating cokriging value
  • Two-panel plots showing raw data (left) and SK vs cokriging comparison (right)
  • Synthetic correlated secondary data for reproducibility

References

n0228a added 16 commits August 26, 2025 00:41
- Implement Simple Collocated Cokriging (SCCK) extending Krige class
- Implement Intrinsic Collocated Cokriging (ICCK) with flexible secondary models
- Add comprehensive test suite with 14 test cases covering:
  - Matrix construction and dimensions
  - Cross-correlation validation
  - RHS vector structure
  - Integration with drift functions
  - Edge cases (zero/perfect correlation)
- Follow gstools patterns: property validation, error handling, documentation
- Matrix structure: (n+1) x (n+1) for n conditioning points + 1 secondary variable
- Uses Markov model assumption: C_zy(h) = ρ * √(C_zz(h) * C_yy(h))
- All tests passing with proper position handling via pre_pos method
- Clean minimal implementation extending Krige base class
- Follows existing gstools design patterns exactly
- Only adds cross_corr parameter and secondary_data requirement
- Uses (n+1)×(n+1) matrix system solved per estimation point
- Full API compatibility: return_var, chunk_size, only_mean, etc.
- Proper integration with gstools post-processing and chunking
- Zero cross-correlation equals Simple Kriging (verified)
- Located in separate cokriging module as requested
- Create CollocatedCokriging base class following kriging module pattern
- Refactor SCCK and ICCK as thin wrappers (algorithm='MM1' vs 'intrinsic')
- Eliminate ~400 lines of duplicated code
- Maintain full backward compatibility
- All tests passing (14/14)
- Cleaner architecture for future extensibility
@MuellerSeb
Copy link
Member

This is great! I love it. I wanted this for a long time but never found the time to work on it.

I have some general questions/remarks:

  1. What about creating a Correlogram (abstract)-base-class with the MarkovModel1 as one child that holds all the logic currently added with several keywords in CollocatedCokriging.
    I think it could look like this:
import gstools as gs

correlo_model = gs.MarkovModel1(
    model=gs.Gaussian(dim=1, var=0.5, len_scale=2),
    cross_corr=0.8,
    secondary_var=1.5,
)
scck = gs.SimpleCollocated(
    correlo_model, cond_pos, cond_val,
    mean=1.0, secondary_mean=0.5
)

Maybe the means could be combined in a list then, or we could also move them into the MarkovModel1 (not sure what makes more sense here), or keep them as is.
We would then also need to think about which functionality we move into the Correlogram class. (e.g. _compute_covariances)
As always my motivation for this would be, to be future prove (for MM2 and maybe other possible models).

  1. How does/should Cokriging work with normalizer, fit_normalizer and fit_variogram?

  2. Do we also need an estimator for the empirical correlogram?

  3. please use math environments in the documentation for formuals.

  4. There should be a _apply_intrinsic_collocated method like the _apply_simple_collocated method.

@n0228a
Copy link
Author

n0228a commented Oct 14, 2025

1: I'm not sure if this will work very well with the approach I chose. As you maybe have seen, after some linear algebra reformulations of the problem, SCCK and ICCK can be shown as an extension to simple Kriging, making it not necessary to construct new matrices, that would need to be solved for every estimation point in collocated cokriging. That is also where the normalizer goes currently., it applies in the way it would apply for Simple Kriging.
3. As for the Markov Model 1 so far, I don't think a cmpirical correlogramm would be necessary, for mm2 also not as I understand the theory (there you would need to fit to the primary variogram again based on your secondary variable and the correlation coefficient).
4. I will update that.
5. I will update that too.

@LSchueler LSchueler added the enhancement New feature or request label Oct 15, 2025
@LSchueler LSchueler added this to the v1.8 milestone Oct 15, 2025
@LSchueler
Copy link
Member

LSchueler commented Oct 16, 2025

Hi, I also really like your work, thanks a lot!

I suggest that we have a chat about @MuellerSeb's suggestions. The Correlogram base class he is talking about would ensure that we can later add more cokriging methods without breaking the interface.

The source checks fail. You have to run a linter on your code, see the instructions at the end of this file.

I also think that cokriging is missing after line 148 of src/gstools/__init__.py.

Depending on the time you still have for this work, I think an example with 2d data would be really valuable, as I guess most applications will be 2d and many of our users like to have very specific examples.

@n0228a
Copy link
Author

n0228a commented Oct 16, 2025

Hi,
I updated the documentation and added the apply_intrinsic method. The linter wants to sort by alphabet, but that raises import errors, the only fix I found is to import Cokrige after Krige, because it inherits from Krige. Maybe you know something.

This commit introduces a new Correlogram base class architecture that
makes collocated cokriging future-proof and extensible for different
cross-covariance models (MM1, MM2, etc.).

**New Features:**

- Added Correlogram abstract base class defining the interface for
  cross-covariance models
- Implemented MarkovModel1 as the first concrete correlogram,
  encapsulating Markov Model I assumptions
- Correlogram objects now hold all cross-covariance parameters:
  primary_model, cross_corr, secondary_var, primary_mean, secondary_mean

**API Changes:**

New (recommended) API:
  correlogram = gs.MarkovModel1(
      primary_model=model,
      cross_corr=0.8,
      secondary_var=1.5,
      primary_mean=1.0,
      secondary_mean=0.5
  )
  scck = gs.SimpleCollocated(correlogram, cond_pos, cond_val)

Backward compatibility via from_parameters() classmethod (deprecated):
  scck = gs.SimpleCollocated.from_parameters(
      model, cond_pos, cond_val,
      cross_corr=0.8, secondary_var=1.5,
      mean=1.0, secondary_mean=0.5
  )

**Refactored Classes:**

- CollocatedCokriging: Now accepts correlogram object instead of
  individual parameters (cross_corr, secondary_var, etc.)
- SimpleCollocated: Updated to use new API with backward compatibility
- IntrinsicCollocated: Updated to use new API with backward compatibility
- Both classes delegate covariance computation to correlogram

**Benefits:**

- Separation of concerns: Cross-covariance modeling separated from
  kriging algorithm
- Extensible: Easy to add MM2, Linear Model of Coregionalization, etc.
- Self-documenting: Explicit about which cross-covariance model is used
- Maintainable: Correlogram classes can be tested independently
- Future-proof: Ready for additional correlogram models

**Testing:**

- Added comprehensive test suite (test_correlogram.py)
- All tests pass with numerical equivalence between old and new API
- Updated examples to demonstrate new API

**Documentation:**

- Updated examples/05_kriging/10_simple_collocated_cokriging.py
- Updated examples/05_kriging/11_intrinsic_collocated_cokriging.py
- Added MarkovModel1 to top-level exports
- Comprehensive docstrings with usage examples

**Future Work:**

- Placeholder for MarkovModel2 implementation
- Potential for other correlogram models (intrinsic correlation, etc.)

Closes: #correlogram-architecture
- Explains new Correlogram base class design
- Provides usage examples for MarkovModel1
- Shows how to implement MarkovModel2 (future)
- Includes migration guide from old to new API
- Documents testing and file structure
@LSchueler
Copy link
Member

I think there are no import problems, the linter is just too stupid.
You can change line 148 in src/gstools/__init__.py to

from gstools import (  # noqa: I001

put cokriging right after kriging and the linter should stop complaining.

@n0228a
Copy link
Author

n0228a commented Oct 28, 2025

I tried to trace the circular import error and it occurs because krige needs field, it then initializes field with the submodule cond_srf and this needs again krige. I still don't understand why this doesn't happened before.
krige → field → cond_srf → krige
I'm not sure what the preferred solution here would be or just keep as is and put cokrige in different position?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants